Flink job on EMR runs only on one TaskManager

I am running an EMR cluster with 3 m5.xlarge nodes (1 master, 2 core) and Flink 1.8 installed (emr-5.24.1).
On the master node I start a Flink session on the YARN cluster with the following command:
flink-yarn-session -s 4 -jm 12288m -tm 12288m
That is the maximum memory and number of slots per TaskManager that YARN lets me allocate for the selected instance types.
During startup the following log line appears:
org.apache.flink.yarn.AbstractYarnClusterDescriptor - Cluster specification: ClusterSpecification{masterMemoryMB=12288, taskManagerMemoryMB=12288, numberTaskManagers=1, slotsPerTaskManager=4}
This shows that there is only one TaskManager. The YARN Node Manager also shows only one container running on one of the core nodes, and the YARN Resource Manager reports that the application is using only 50% of the cluster.
With the current setup I would assume that I can run a Flink job with parallelism set to 8 (2 TaskManagers * 4 slots), but when a submitted job has parallelism greater than 4, it fails after a while because it cannot acquire the desired resources.
When the job parallelism is set to 4 (or less), the job runs as it should. CPU and memory utilisation in Ganglia show that only one node is utilised, while the other stays flat.
Why does the application run on only one node, and how can I utilise the other node as well? Do I need to configure something in YARN so that Flink is started on the other node too?
In previous versions of Flink there was a startup option -n that specified the number of TaskManagers; that option is now obsolete.

When you start a session cluster, you should see only one container, which is used for the Flink JobManager. This is probably what you see in the YARN Resource Manager. Additional containers for TaskManagers are allocated automatically once you submit a job.
How many cores do you see available in the Resource Manager UI?
Don't forget that the JobManager also uses cores out of the available 8.
You need to do a little math here.
For example, if you set the number of slots per TM to 2 (with correspondingly less memory per TM) and submit a job with parallelism 6, Flink will request 3 TMs and the job should run.
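A minimal sketch of that setup (the memory values are illustrative rather than tuned for m5.xlarge, and the example jar path is just a placeholder):
flink-yarn-session -s 2 -jm 2048m -tm 5120m -d
flink run -p 6 ./examples/streaming/WordCount.jar
With 2 slots per TM and parallelism 6, Flink requests 3 TaskManager containers, which YARN can spread across both core nodes.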

Related

Aerospike migrations issue

We have a 10-node cluster running version 3.9 with cold-start-empty set to false, on which we performed the following steps:
Added node 10.0.29.212 running community build 3.13.0.10.
Waited for migrations to finish (new cluster size 11). There were incoming migrations only on node 10.0.29.212, as expected.
Added 2 nodes, 10.0.29.190 and 10.0.29.135, simultaneously, both running community build 3.13.0.10.
Waited for migrations to finish (new cluster size 13). Incoming migrations ran only on these two nodes, as expected.
Added node 10.0.29.214 a few hours later, running community build 3.13.0.10.
Immediately after this node was added, the total master objects in the cluster dropped, incoming migrations started on all nodes, and we started getting timeouts on the cluster.
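To watch migration progress while nodes join, something like the following can be run against each node (asinfo ships with the Aerospike tools package; the host is from the question above and the exact counter names vary by server version, so treat this as a sketch):
asinfo -h 10.0.29.212 -v 'statistics' | tr ';' '\n' | grep -i migrate
This dumps the migrate_* counters (partitions remaining, incoming, outgoing) so you can see which nodes are sending and which are receiving.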

Jenkins - How to run two jobs in parallel (1 FT job and 1 Selenium job) on the same slave node

I want to run two jobs in parallel on the same slave.
Job 1 is a functional testing job that doesn't require a browser, and Job 2 is a Selenium job that requires a browser for testing.
As for running the jobs on the same slave, you can use the option Restrict where this project can be run, assuming you have the Jenkins slave configured in your setup.
For running the jobs in parallel, it depends on whether you are doing this via a Jenkinsfile or via freestyle jobs. For a Jenkinsfile, you can use the parallel stages feature as described here and sketched below. For freestyle jobs, I would suggest adding one more job (for example, a setup job) and using it to trigger your two jobs at the same time.
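A minimal declarative Jenkinsfile sketch of the parallel approach (the slave label and the downstream job names are hypothetical, and each downstream job still needs its own restriction to the same slave):
pipeline {
    agent { label 'test-slave' }               // hypothetical label of your slave
    stages {
        stage('Run tests in parallel') {
            parallel {
                stage('Functional tests') {
                    steps {
                        build job: 'ft-job'        // hypothetical FT job name
                    }
                }
                stage('Selenium tests') {
                    steps {
                        build job: 'selenium-job'  // hypothetical Selenium job name
                    }
                }
            }
        }
    }
}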

Does restarting Hadoop YARN affect running MapReduce jobs?

A few times I restarted YARN while there were MapReduce jobs running on it, but I found that the running MR jobs were not affected; after restarting YARN, the MR jobs resumed immediately. I was wondering why the MR jobs didn't fail. By the way, all of my MapReduce jobs were Pig script jobs.
The link describes the HA (high availability) architecture of YARN's ResourceManager.
In your case, I believe automatic failover is enabled, so when the ResourceManager goes down, another RM is automatically elected to be the active one.
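A sketch of the yarn-site.xml settings that make this behaviour possible (values are illustrative; many distributions enable recovery by default):
<!-- persist RM state so applications survive an RM restart -->
<property>
  <name>yarn.resourcemanager.recovery.enabled</name>
  <value>true</value>
</property>
<!-- keep running containers alive across the restart -->
<property>
  <name>yarn.resourcemanager.work-preserving-recovery.enabled</name>
  <value>true</value>
</property>
<!-- state store used for recovery, e.g. ZooKeeper -->
<property>
  <name>yarn.resourcemanager.store.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<!-- optional: automatic failover between two RMs -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
With work-preserving recovery, running containers keep running during the restart and the NodeManagers re-register with the new active RM, which matches the behaviour you observed.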

In what mode is Hive installed?

Does the Hive installation have any specific modes?
For example, a Hadoop installation has 3 modes: standalone, pseudo-distributed and fully distributed.
Similarly, does Hive have any specific type of distribution? Can Hive be installed in distributed mode?
Hive actually gives you the option to run queries in 2 modes:
1- Map-Reduce mode
2- Local mode
Normally the Hive compiler generates map-reduce jobs for most queries under the hood. These jobs are then submitted to the map-reduce cluster indicated by the variable:
mapred.job.tracker
While this usually points to a map-reduce cluster with multiple nodes, Hadoop also provides the ability to run map-reduce jobs locally on your standalone workstation. To run Hive queries in local mode, do this:
hive> SET mapred.job.tracker=local;
Details can be found here.
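As a related convenience (worth verifying for your Hive version), Hive can also decide on its own to run small queries locally:
hive> SET hive.exec.mode.local.auto=true;
hive> SELECT COUNT(*) FROM sales;   -- table name is illustrative; small inputs now run as a local job
With hive.exec.mode.local.auto enabled, Hive falls back to local execution when the input is small enough, and uses the cluster otherwise.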

JBoss Cluster setup with Hudson?

I want to have a Hudson setup that has two cluster nodes with JBoss. There is already a test machine with Hudson and it is running the nightly build and tests. At the moment the application is deployed on the Hudson box.
There are a couple of options in my mind. One could be to use the SCP plugin for Hudson to copy the EAR file from the master to the cluster nodes. The other could be to set up Hudson slaves on the cluster nodes.
Any opinions, experiences or other approaches?
edit: I set up a slave, but it seems I can't make a job run on more than one slave without copying the job. Am I missing something?
You are right: you can't run different build steps of one job on different nodes. However, a job can be configured to run on different slaves; Hudson then determines at execution time which node the job will run on.
You need to configure labels for your nodes. A node can have more than one label, and a job can also require more than one label.
Example:
Node 1 has labels maven and db2
Node 2 has labels maven and ant
Job 1 requires label maven: it can run on Node 1 and Node 2
Job 2 requires label ant: it can run on Node 2
Job 3 requires labels maven and db2: it can run on Node 1
If you need different build steps of one job to run on different nodes, you have to create more than one job and chain them. You trigger only the first job, which then triggers the subsequent jobs; a downstream job can access the artifacts of the previous one. You can even run two jobs in parallel and, when both are done, automatically trigger the next job. You will need the Join Plugin for the parallel jobs.
If you want load balancing and central administration from Hudson (i.e. configuring projects, seeing which builds are currently running, etc.), you must run slaves.